READ_CSV

The READ_CSV function reads data from a “comma-separated value” (comma-delimited) text file or URL into an IDL structure variable.

This routine handles CSV files consisting of one or more optional table header lines, followed by one optional column header line, followed by columnar data, with commas separating each field. Each row is assumed to be a new record. Blank lines at the beginning of the file are automatically ignored.

This routine is written in the IDL language. Its source code can be found in the file read_csv.pro in the lib subdirectory of the IDL distribution.

Example

Using DIALOG_PICKFILE, you will locate a specific file on your computer, read it in, and display some information about it.

; Use DIALOG_PICKFILE to locate 'ScatterplotData.csv'
; and store its path in the variable testfile.
testfile = DIALOG_PICKFILE(FILTER='*.csv')

Navigate to C:\Program Files\***\IDLxx\examples\data and open the file ScatterplotData.csv.

; Read the file and assign the result to the sed_data variable;
; store the header information in variables for later use.
sed_data = READ_CSV(testfile, HEADER=SedHeader, $
  N_TABLE_HEADER=1, TABLE_HEADER=SedTableHeader)
; Display the column headers and the table header.
PRINT, SedHeader
PRINT, SedTableHeader

IDL displays the field names and the table heading separately:

Distance from Terminus (meters) Mean Particle size (mm) Sedimentation Rate (g/cm2yr)

2012 Simulated Sediment Distribution at the terminus of SE Alaskan Tidewater Glaciers

Display additional information about the file:

; Show information on the structure returned by READ_CSV
HELP, sed_data, /STRUCTURES

IDL displays:

** Structure <12801420>, 3 tags, length=880, data length=880, refs=1:
   FIELD1          LONG      Array[44]
   FIELD2          DOUBLE    Array[44]
   FIELD3          DOUBLE    Array[44]
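
The returned structure can be explored programmatically as well as with HELP. The following is a minimal sketch, not part of the original example: it writes a small CSV file (the file name and its contents are invented for illustration) to the temporary directory, reads it back, and pairs each structure field with its column header.

```idl
; Build a tiny three-column CSV file in the temporary directory.
; The file name and its contents are made up for this sketch.
file = FILEPATH('read_csv_fields_demo.csv', /TMP)
OPENW, lun, file, /GET_LUN
PRINTF, lun, 'distance,size,rate'
PRINTF, lun, '10,0.25,1.5'
PRINTF, lun, '20,0.20,1.1'
PRINTF, lun, '30,0.15,0.8'
FREE_LUN, lun

; Read the file; the column header line is returned in HEADER.
data = READ_CSV(file, HEADER=header)

; TAG_NAMES returns the structure field names (FIELD1, FIELD2, ...).
tags = TAG_NAMES(data)

; Pair each field with its column header and report its element count.
; data.(i) accesses the i-th field of the structure by index.
FOR i = 0, N_TAGS(data) - 1 DO $
  PRINT, tags[i], ' <- ', header[i], ': ', N_ELEMENTS(data.(i)), ' values'

; A single column can also be referenced by its field name.
PRINT, MEAN(data.FIELD2)
```

Accessing fields by index is convenient when the number of columns is not known in advance; accessing them by name is clearer when the layout of the file is fixed.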

Syntax

Result = READ_CSV( Filename [, COUNT=variable] [, HEADER=variable] [, MISSING_VALUE=value] [, N_TABLE_HEADER=value] [, NUM_RECORDS=value] [, RECORD_START=value] [, TABLE_HEADER=variable] [, TYPES=value] )

Return Value

Returns a structure variable that contains one structure field for each column of data in the CSV file. The data type of the individual structure fields is determined using the following rules:

Resulting Data Type	If this condition is met
Long integer	All data within the column consists of integers, all of which are smaller than the largest 32-bit signed integer.
64-bit Long integer	All data within the column consists of integers, at least one of which is greater than the largest 32-bit signed integer.
Double-precision floating point	All data within the column consists of numbers, at least one of which has a decimal point or exponent.
String	Any column that does not meet one of the above conditions.

Note: When determining the data type, IDL examines only the first 100 and last 100 elements in each column. You can use the TYPES keyword to override the automatic type checking.

Arguments

Filename

A string containing the name of a CSV file to be read. Filename may be a local file or a URL. If a URL is specified, it must contain the scheme, host, and full path to the CSV file on that server.

Keywords

COUNT

Set this keyword equal to a named variable that will contain the number of records read.

HEADER

Set this keyword equal to a named variable that will contain the column headers as a vector of strings. If no header exists, an empty scalar string is returned.

Note: The headers will only be returned for CSV files that contain numeric data. If the CSV file contains only string data, the header (if it exists) will be contained in the first data record.

MISSING_VALUE

Set this keyword equal to a value used to replace any missing floating-point or integer data. The default value is 0.
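
As an illustration, the sketch below reads a file in which one numeric field is empty. The file name, the data, and the sentinel value -999 are assumptions chosen for this example, and the sketch assumes the affected column is still detected as numeric.

```idl
; Write a small CSV file whose second record has an empty middle field.
; The file name and its contents are made up for this sketch.
file = FILEPATH('read_csv_missing_demo.csv', /TMP)
OPENW, lun, file, /GET_LUN
PRINTF, lun, 'station,temperature,depth'
PRINTF, lun, '1,12.5,10'
PRINTF, lun, '2,,20'
PRINTF, lun, '3,14.0,30'
FREE_LUN, lun

; Ask READ_CSV to substitute -999 for missing numeric data
; instead of the default value of 0.
data = READ_CSV(file, HEADER=header, MISSING_VALUE=-999)

; Inspect the temperature column to see how the empty field was filled.
PRINT, data.FIELD2
```

A sentinel that cannot occur in valid data makes missing entries easy to locate later, for example with WHERE.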

N_TABLE_HEADER

Set this keyword equal to the number of lines of the CSV file to skip before beginning to read records from the file. (If a column header line is present, it will not be skipped.) If the TABLE_HEADER keyword specifies a variable to hold table header information, each skipped line will be stored as a single string in that variable.

If the HEADER keyword is also present, the column headers will be taken from the line following the line specified by N_TABLE_HEADER.

NUM_RECORDS

Set this keyword equal to an integer specifying the number of records to read. By default all records are read.

RECORD_START

Set this keyword equal to an integer specifying the index of the first record to read. The default is the first record of the file (record 0).
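
RECORD_START, NUM_RECORDS, and COUNT can be combined to read a slice of a file and confirm how many records were returned. The following is a minimal sketch; the file name and the data are invented for illustration, and the file deliberately contains no header lines so that record 0 is its first line.

```idl
; Write ten records of purely numeric data (no header lines).
; The file name and its contents are made up for this sketch.
file = FILEPATH('read_csv_subset_demo.csv', /TMP)
OPENW, lun, file, /GET_LUN
FOR i = 0, 9 DO PRINTF, lun, STRTRIM(i, 2) + ',' + STRTRIM(i * 1.5, 2)
FREE_LUN, lun

; Start at record index 3 and request four records.
; COUNT receives the number of records that were read.
data = READ_CSV(file, COUNT=nRead, RECORD_START=3, NUM_RECORDS=4)

PRINT, 'Records read: ', nRead
PRINT, data.FIELD1
```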

TABLE_HEADER

Set this keyword equal to a named variable that will contain the file's table header information as a vector of strings. The number of lines specified by the N_TABLE_HEADER keyword will be interpreted as the table header. Table header information appears in the CSV file above the column headers specified by the HEADER keyword (if present).

TYPES

Set this keyword to a string array containing the IDL data types for each column of data. If TYPES is missing, or if there are more columns than elements in TYPES, then IDL automatically determines the data type for the extra columns using the rules given under Return Value above. Possible values for data types are:

"" - An empty string indicates that IDL should automatically determine the data type for that column
"Byte" - Byte data
"Int" - 16-bit signed integer data
"Long" - 32-bit signed integer data
"Float" - 32-bit floating-point data
"Double" - 64-bit floating-point data
"Uint" - 16-bit unsigned integer data
"Ulong" - 32-bit unsigned integer data
"Long64" - 64-bit signed integer data
"Ulong64" - 64-bit unsigned integer data
"String", "Date", "Time", or "Datetime" - String data

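As an illustration of the TYPES keyword (available in IDL 8.5 and later, per the version history), the sketch below forces the type of two columns and leaves the third to automatic detection. The file name and the data are invented for this example, and the file contains no header lines to keep the sketch simple.

```idl
; Write a small CSV file whose first column has leading zeros.
; The file name and its contents are made up for this sketch.
file = FILEPATH('read_csv_types_demo.csv', /TMP)
OPENW, lun, file, /GET_LUN
PRINTF, lun, '007,12,0.5'
PRINTF, lun, '042,15,0.75'
FREE_LUN, lun

; Keep the first column as strings, store the second as 16-bit
; integers, and let IDL choose the type of the third column.
data = READ_CSV(file, TYPES=['String', 'Int', ''])

; Show the resulting field types.
HELP, data, /STRUCTURES
```

Requesting "String" for identifier-like columns avoids having them converted to integers, which would discard the leading zeros.
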
Version History

7.1	Introduced
8.0	N_TABLE_HEADER and TABLE_HEADER keywords added
8.5	Added TYPES keyword. Added support for URL files.